This section documents the results of regression analyses aimed at identifying which variables vary by region, race/ethnicity, and/or age, in order to determine the combination of attributes on which to stratify each parameter. The first step looks at associations by age after adjusting for race/ethnicity and region to determine whether the data show any age effects and, if so, how age should be parameterized. After fitting a model adjusted for the main effects of race/ethnicity, region, and age (categorized as a factor variable with categories 18-24, 25-29, …, 55-59), we plot the coefficients for age and examine the pattern. Error bars on these plots show the 95% confidence interval around each estimate (the estimate plus or minus 1.96 times its standard error). If the plots suggest a meaningful trend or other pattern by age, we re-run the models including age in the form indicated by the plot. If not, we fit models excluding age to examine the associations with race and region. We first examine main effects and then test the significance of all two-way interactions.
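The age-coefficient plots described above can be sketched as follows. This is a minimal illustration on simulated data, not the code used for this report: the data frame `dat`, its column names, and the simulated outcome are all assumptions made for the example.

```r
# Simulated stand-in data; none of these values come from the survey.
set.seed(1)
n <- 900
age_levels <- c("18-24", "25-29", "30-34", "35-39",
                "40-44", "45-49", "50-54", "55-59")
dat <- data.frame(
  age_cat = factor(sample(age_levels, n, replace = TRUE), levels = age_levels),
  race    = factor(sample(c("Hispanic", "Black", "Other"), n, replace = TRUE)),
  region  = factor(sample(c("King County", "Western WA", "Eastern WA"),
                          n, replace = TRUE)),
  outcome = rbinom(n, 1, 0.45)
)

# Logistic model with main effects for race/ethnicity, region, and age category.
fit <- glm(outcome ~ race + region + age_cat, family = binomial, data = dat)

# Extract the age coefficients and build intervals of estimate +/- 1.96 * SE.
est <- coef(summary(fit))
age_rows <- grep("^age_cat", rownames(est))
age_coefs <- data.frame(
  term     = rownames(est)[age_rows],
  estimate = est[age_rows, "Estimate"],
  se       = est[age_rows, "Std. Error"]
)
age_coefs$lower <- age_coefs$estimate - 1.96 * age_coefs$se
age_coefs$upper <- age_coefs$estimate + 1.96 * age_coefs$se

# Plot point estimates with error bars; the reference group (18-24) has no
# coefficient and is therefore not drawn.
x <- seq_len(nrow(age_coefs))
plot(x, age_coefs$estimate, pch = 19, xaxt = "n",
     ylim = range(age_coefs$lower, age_coefs$upper),
     xlab = "Age group (reference: 18-24)", ylab = "Log odds ratio")
axis(1, at = x, labels = sub("^age_cat", "", age_coefs$term))
segments(x, age_coefs$lower, x, age_coefs$upper)
abline(h = 0, lty = 2)
```

Coefficients are on the log-odds scale relative to the youngest group, so an interval that excludes zero marks an age group that differs from men aged 18-24 after adjustment.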

P-values are used as a guide but not as the sole criterion for determining which differences may be worth modeling. The purpose of this analysis is to determine how to specify parameters in order to get the network to reproduce the data. It is not an analysis to determine from the data which parameters differ significantly by region, race, or age in the population (since the sample is not probability-based, this would be difficult to conclude even if it were our goal). As such, corrections for multiple comparisons are not necessary, as these adjust the false discovery rate in analyses attempting to make inference about the population. We use likelihood ratio tests (package lmtest) to compare models for joint significance testing.
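A likelihood ratio test for the joint significance of a factor compares a model with the factor to one without it. The sketch below uses simulated data and base R only; the variable names and effect sizes are assumptions for illustration. The report names the lmtest package, and `lmtest::lrtest(reduced, full)` should give the same comparison.

```r
# Simulated stand-in data; the regional effect is invented for the example.
set.seed(2)
n <- 900
dat <- data.frame(
  race   = factor(sample(c("Hispanic", "Black", "Other"), n, replace = TRUE)),
  region = factor(sample(c("King County", "Western WA", "Eastern WA"),
                         n, replace = TRUE))
)
dat$outcome <- rbinom(n, 1, ifelse(dat$region == "King County", 0.55, 0.40))

# Main effect of region: compare models with and without the region terms.
full    <- glm(outcome ~ race + region, family = binomial, data = dat)
reduced <- glm(outcome ~ race, family = binomial, data = dat)

# LR statistic = 2 * (logLik(full) - logLik(reduced)), referred to a
# chi-square distribution with df equal to the number of dropped parameters.
lr_stat  <- as.numeric(2 * (logLik(full) - logLik(reduced)))
lr_df    <- attr(logLik(full), "df") - attr(logLik(reduced), "df")
p_region <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

# Two-way interaction: compare the main-effects model to one adding race:region.
inter   <- glm(outcome ~ race * region, family = binomial, data = dat)
p_inter <- anova(full, inter, test = "Chisq")[2, "Pr(>Chi)"]

print(c(region = p_region, race_by_region = p_inter))
```

Because region has three levels, the test for its main effect has 2 degrees of freedom, which is why a single joint test is used rather than reading individual coefficient p-values.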

For variables for which the models suggest meaningful heterogeneity by age, race/ethnicity, and/or region, we further explore this heterogeneity by plotting the data stratified by the indicated nodal attributes. These plots are used to inform how, if at all, to represent the observed heterogeneity in the model.

Regression analyses

Degree distribution

To look for differences in the degree distribution, we will break it down to four binary indicators corresponding to: 1) whether men have any ongoing main partners, 2) whether they have any ongoing persistent partners, 3) whether they have two or more ongoing persistent partners, and 4) whether men who reported an ongoing main partner reported any concurrent persistent partners.

Any ongoing main partner

We first look at the number of observations that are non-missing for this outcome and for model predictors

## 
## Hispanic    Black    Other 
##      176       46      677
## 
## 18-24 25-29 30-34 35-39 40-44 45-49 50-54 55-59 
##   250   185   140    87    66    66    53    52
## 
## King County  Western WA  Eastern WA 
##         534         240         125

Next we will look at how to parameterize age by plotting the coefficients for age from the model adjusting for age, race, and region, with age input as a dummy variable. The error bars on these plots correspond to 1.96 times the standard error on either side of the point estimate.

From this there does not appear to be any significant pattern by age, so we will not represent heterogeneity by age. The models below test the effects of race/ethnicity and region in models that do not adjust for age.

## [[1]]
##   Variable pvalue  sig
## 1   Region 0.0140    x
## 2     Race 0.0188    x
## 3      Age     NA <NA>
## 
## [[2]]
##         Variable pvalue  sig
## 1 Race by region 0.0432    x
## 2  Age by region     NA <NA>
## 3    Race by age     NA <NA>

Any ongoing persistent partners

We first look at the number of observations that are non-missing for this outcome and for model predictors

## 
## Hispanic    Black    Other 
##      176       46      667
## 
## 18-24 25-29 30-34 35-39 40-44 45-49 50-54 55-59 
##   247   184   138    86    66    65    52    51
## 
## King County  Western WA  Eastern WA 
##         529         238         122

Next we will look at how to parameterize age by plotting the coefficients for age from the model adjusting for age, race, and region, with age input as a dummy variable. The error bars on these plots correspond to 2 times the standard error on either side of the point estimate.

The data suggest that the likelihood of having any persistent partner is higher for men aged 40-44 and 45-49 relative to men aged 18-24. There is otherwise not a clear trend by age, and these elevated probabilities for men aged 40-49 do not align with our understanding of sexual behavior throughout the lifecourse. However, because there is a strong signal from these data, and the pattern is consistent for the outcomes of having two or more persistent partners and having a persistent partner among those with a main partner (below), we will explore the effect of modeling a higher probability of forming persistent partnerships for men aged 40-49 in sensitivity analyses. In the models below, we test the effects of race/ethnicity and region without adjustment for age, and separately in models adjusting for age as a binary indicator for being aged 40-49 vs all other ages.
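The binary age indicator used in the age-adjusted models can be derived from the 5-year age groups. The sketch below is illustrative only; the object names `age_cat` and `age_40_49` are assumptions, not names taken from the analysis code.

```r
# Five-year age groups as used in the tables in this section.
age_levels <- c("18-24", "25-29", "30-34", "35-39",
                "40-44", "45-49", "50-54", "55-59")

# A few example respondents (made-up values).
age_cat <- factor(c("18-24", "40-44", "45-49", "55-59", "30-34"),
                  levels = age_levels)

# 1 if aged 40-49, 0 for all other ages.
age_40_49 <- as.integer(age_cat %in% c("40-44", "45-49"))
print(data.frame(age_cat, age_40_49))
```

The indicator would then replace the 8-level factor in the model formula, for example `outcome ~ race + region + age_40_49`, so that age costs a single parameter.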

## [[1]]
##   Variable pvalue  sig
## 1   Region 0.5143     
## 2     Race 0.4046     
## 3      Age     NA <NA>
## 
## [[2]]
##         Variable pvalue  sig
## 1 Race by region 0.1101     
## 2  Age by region     NA <NA>
## 3    Race by age     NA <NA>
## [[1]]
##   Variable pvalue sig
## 1   Region 0.6485    
## 2     Race 0.5212    
## 3      Age 0.0002 xxx
## 
## [[2]]
##         Variable pvalue sig
## 1 Race by region 0.1043    
## 2  Age by region 0.5473    
## 3    Race by age 0.3519

Two or more ongoing persistent partners

We first look at the number of observations that are non-missing for this outcome and for model predictors

## 
## Hispanic    Black    Other 
##      176       46      667
## 
## 18-24 25-29 30-34 35-39 40-44 45-49 50-54 55-59 
##   247   184   138    86    66    65    52    51
## 
## King County  Western WA  Eastern WA 
##         529         238         122

Next we will look at how to parameterize age by plotting the coefficients for age from the model adjusting for age, race, and region, with age input as a dummy variable. The error bars on these plots correspond to 2 times the standard error on either side of the point estimate.

The pattern of concurrent persistent partnerships by age is similar to the pattern for having any persistent partnership, with a higher likelihood among men aged 40-44 and 45-49 relative to men aged 18-24. Because there is a strong signal from these data and the pattern is consistent for the outcomes of having any persistent partners and having a persistent partner among those with a main partner (above and below), we will explore the effect of modeling a higher probability of forming persistent partnerships for men aged 40-49 in sensitivity analyses. In the models below, we test the effects of race/ethnicity and region without adjustment for age, and separately in models adjusting for age as a binary indicator for being aged 40-49 vs all other ages.

## [[1]]
##   Variable pvalue  sig
## 1   Region 0.2597     
## 2     Race 0.0981     
## 3      Age     NA <NA>
## 
## [[2]]
##         Variable pvalue  sig
## 1 Race by region 0.6299     
## 2  Age by region     NA <NA>
## 3    Race by age     NA <NA>
## [[1]]
##   Variable pvalue sig
## 1   Region 0.3263    
## 2     Race 0.1225    
## 3      Age 0.0013  xx
## 
## [[2]]
##         Variable pvalue sig
## 1 Race by region 0.5333    
## 2  Age by region 0.1120    
## 3    Race by age 0.3326

Ongoing main partners with concurrent persistent partners

We first look at the number of observations that are non-missing for this outcome and for model predictors

## 
## Hispanic    Black    Other 
##       99       18      298
## 
## 18-24 25-29 30-34 35-39 40-44 45-49 50-54 55-59 
##   107    95    74    34    31    34    21    19
## 
## King County  Western WA  Eastern WA 
##         268          94          53

Next we will look at how to parameterize age by plotting the coefficients for age from the model adjusting for age, race, and region, with age input as a dummy variable. The error bars on these plots correspond to 2 times the standard error on either side of the point estimate.

It appears that the probability of having persistent partners concurrently with a main partner is higher for men aged 45-49. Although this result is unexpected and doesn’t fit in with a larger pattern, it is consistent with the results for having any persistent or concurrent persistent partners, above. As such, we will explore the effect of modeling a higher probability of forming persistent partnerships concurrent with main partners for men aged 40-49 in sensitivity analyses. We use the full decade 40-49 to align with the approaches taken for the models above, and because the number of men aged 45-49 is small. In the models below, we test the effects of race/ethnicity and region without adjustment for age, and separately in models adjusting for age as a binary indicator for being aged 40-49 vs all other ages.

## [[1]]
##   Variable pvalue  sig
## 1   Region 0.6180     
## 2     Race 0.1837     
## 3      Age     NA <NA>
## 
## [[2]]
##         Variable pvalue  sig
## 1 Race by region 0.3352     
## 2  Age by region     NA <NA>
## 3    Race by age     NA <NA>
## [[1]]
##   Variable pvalue sig
## 1   Region 0.6017    
## 2     Race 0.2957    
## 3      Age 0.0122   x
## 
## [[2]]
##         Variable pvalue sig
## 1 Race by region 0.1715    
## 2  Age by region 0.2167    
## 3    Race by age 0.0684

Rate of instantaneous partners

This analysis looks at heterogeneity in the rate of one-time partnerships separately for men who have 0 main, those who have 1 main, those who have 0 persistent, and those who have one or more persistent partnerships.

No main partners

We first look at the number of observations that are non-missing for this outcome and for model predictors

## 
## Hispanic    Black    Other 
##       77       27      372
## 
## 18-24 25-29 30-34 35-39 40-44 45-49 50-54 55-59 
##   142    87    64    53    35    31    32    32
## 
## King County  Western WA  Eastern WA 
##         261         143          72

Next we will look at how to parameterize age by plotting the coefficients from the full model with age input as a dummy variable with 5-year age bins.

This plot suggests that the rate of instantaneous partners is higher for men aged 35-39 and subsequently decreases with age. The point for ages 30-34 does not fit with a trend suggesting that the rate increases with age up through age 39, but it may be an outlier. To further explore the association with age, we will include age as a dummy variable, grouping together men in age groups 40-49 and 50-59. The models and likelihood ratio tests are below.
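Grouping the older 5-year bins into 10-year bins can be done by relabeling factor levels. This is a minimal sketch with made-up values; `age_cat` and `age_cat6` are hypothetical names.

```r
age_levels <- c("18-24", "25-29", "30-34", "35-39",
                "40-44", "45-49", "50-54", "55-59")
age_cat <- factor(c("18-24", "35-39", "40-44", "45-49", "50-54", "55-59"),
                  levels = age_levels)

# Keep the four youngest groups as they are and merge 40-44 with 45-49 and
# 50-54 with 55-59. Assigning a named list to levels() maps old levels
# (the values) onto new levels (the names).
age_cat6 <- age_cat
levels(age_cat6) <- list(
  "18-24" = "18-24",
  "25-29" = "25-29",
  "30-34" = "30-34",
  "35-39" = "35-39",
  "40-49" = c("40-44", "45-49"),
  "50-59" = c("50-54", "55-59")
)
print(table(age_cat6))
```

Entering `age_cat6` in the model as a factor gives five dummy variables against the 18-24 reference group instead of seven.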

## [[1]]
##   Variable pvalue sig
## 1   Region 0.6042    
## 2     Race 0.7896    
## 3      Age 0.0246   x
## 
## [[2]]
##         Variable pvalue sig
## 1 Race by region 0.6736    
## 2  Age by region 0.8754    
## 3    Race by age 0.2738

Main partner

We first look at the number of observations that are non-missing for this outcome and for model predictors

## 
## Hispanic    Black    Other 
##       99       18      299
## 
## 18-24 25-29 30-34 35-39 40-44 45-49 50-54 55-59 
##   107    95    74    34    31    35    21    19
## 
## King County  Western WA  Eastern WA 
##         269          94          53

Next we will look at how to parameterize age by plotting the coefficients from the full model with age input as a dummy variable with 5-year age bins.

As for men with no main partners, this plot suggests that the rate of instantaneous partners increases with age through age 39 and subsequently decreases. Consistent with the approach taken above, we will include age as a dummy variable, grouping together men in age groups 40-49 and 50-59.

The models and likelihood ratio tests are below.

## [[1]]
##   Variable pvalue sig
## 1   Region 0.3919    
## 2     Race 0.1614    
## 3      Age 0.0451   x
## 
## [[2]]
##         Variable pvalue sig
## 1 Race by region 0.7025    
## 2  Age by region 0.8731    
## 3    Race by age 0.2770

No persistent partners

We first look at the number of observations that are non-missing for this outcome and for model predictors

## 
## Hispanic    Black    Other 
##      141       33      509
## 
## 18-24 25-29 30-34 35-39 40-44 45-49 50-54 55-59 
##   201   147   100    69    44    39    42    41
## 
## King County  Western WA  Eastern WA 
##         400         185          98

Next we will look at how to parameterize age by plotting the coefficients from the full model with age input as a dummy variable with 5-year age bins.

The pattern here is a bit less clear than above, but it suggests that the rate of instantaneous partners increases through age 39 and subsequently decreases with age. Although only the coefficient for ages 35-39 is significant, we will further explore the association with age, including age as a dummy variable, grouping together men in age groups 40-49 and 50-59.

The models and likelihood ratio tests are below.

## [[1]]
##   Variable pvalue sig
## 1   Region 0.8458    
## 2     Race 0.7943    
## 3      Age 0.0138   x
## 
## [[2]]
##         Variable pvalue sig
## 1 Race by region 0.7729    
## 2  Age by region 0.8759    
## 3    Race by age 0.9801

Any persistent partners

We first look at the number of observations that are non-missing for this outcome and for model predictors

## 
## Hispanic    Black    Other 
##       34       12      157
## 
## 18-24 25-29 30-34 35-39 40-44 45-49 50-54 55-59 
##    46    35    38    16    22    26    10    10
## 
## King County  Western WA  Eastern WA 
##         127          52          24

Next we will look at how to parameterize age by plotting the coefficients from the full model with age input as a dummy variable with 5-year age bins.

The pattern here is even less clear than when stratifying by main partner status, but it suggests that the rate of instantaneous partners increases through age 39 and subsequently decreases with age. Although only the coefficient for ages 35-39 is significant, we will further explore the association with age, including age as a dummy variable, grouping together men in age groups 40-49 and 50-59 to be consistent with the approach taken above.

The models and likelihood ratio tests are below.

## [[1]]
##   Variable pvalue sig
## 1   Region 0.0870    
## 2     Race 0.6340    
## 3      Age 0.2077    
## 
## [[2]]
##         Variable pvalue sig
## 1 Race by region 0.1968    
## 2  Age by region 0.6692    
## 3    Race by age 0.2042

Race/ethnicity mixing

To examine differences in mixing by race/ethnicity, we will look at the proportion of men reporting that their most recent partner was in the same racial/ethnic group (Hispanic, black, or other) as themselves. For main and persistent partner types, this analysis is restricted to most recent partners who are ongoing. Note: The outcome is based on egos' reports of their partners' race and, at this point, does not adjust for imbalances in reported partnering patterns.

Main partners

We first look at the number of observations that are non-missing for this outcome and for model predictors

## 
## Hispanic    Black    Other 
##       73       11      256
## 
## 18-24 25-29 30-34 35-39 40-44 45-49 50-54 55-59 
##    83    85    62    27    23    29    19    12
## 
## King County  Western WA  Eastern WA 
##         224          72          44

Next we will look at how to parameterize age by plotting the coefficients from the full model with age input as a dummy variable with 5-year age bins. All men in age group 55-59 reported a partner of the same race/ethnicity, making this estimate unstable. Because this estimate and its confidence interval are much larger than the estimates and intervals for other age groups, the plot including this point masks variability. As such, the second plot excludes estimates for the 55-59 year-old group.
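The instability described here arises when every observation in a group has the same outcome, so the maximum likelihood estimate for that group is not finite. A small sketch with invented counts (not the survey data) shows the effect on the fitted coefficient and its standard error.

```r
# Invented counts: 40 men aged 18-24, of whom 28 report a same-race partner,
# and 12 men aged 55-59, all of whom report a same-race partner.
age_grp <- factor(rep(c("18-24", "55-59"), times = c(40, 12)),
                  levels = c("18-24", "55-59"))
same_race <- c(rep(c(1, 0), times = c(28, 12)), rep(1, 12))

fit <- glm(same_race ~ age_grp, family = binomial)

# With no variation in the 55-59 group, the fitting algorithm stops at a very
# large coefficient with an even larger standard error.
est <- coef(summary(fit))["age_grp55-59", c("Estimate", "Std. Error")]
print(est)
```

An interval of plus or minus 1.96 standard errors around such an estimate spans thousands of log-odds units, which is why including it stretches the plot's axis and hides the variation among the other age groups.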

These coefficients suggest that racial/ethnic homophily is lower among those aged 50-54 and higher among those aged 40-44 and 55-59, though these latter two estimates are not statistically significant. This does not suggest any meaningful trend, so we will not include age in the models below to test for differences by region.

## [[1]]
##   Variable pvalue  sig
## 1   Region 0.0711     
## 2     Race 0.0000  xxx
## 3      Age     NA <NA>
## 
## [[2]]
##         Variable pvalue  sig
## 1 Race by region 0.1654     
## 2  Age by region     NA <NA>
## 3    Race by age     NA <NA>

Persistent partners

We first look at the number of observations that are non-missing for this outcome and for model predictors

## 
## Hispanic    Black    Other 
##       17        6       77
## 
## 18-24 25-29 30-34 35-39 40-44 45-49 50-54 55-59 
##    21    15    23     7    14    11     6     3
## 
## King County  Western WA  Eastern WA 
##          62          26          12

Next we will look at how to parameterize age by plotting the coefficients from the full model with age input as a dummy variable with 5-year age bins. As with main partnerships, all men in age group 55-59 reported a persistent partner of the same race/ethnicity, making this estimate unstable. Because this estimate and its confidence interval are much larger than the estimates and intervals for other age groups, the plot including this point masks variability. As such, the second plot excludes estimates for the 55-59 year-old group.

As with main partnerships, there does not appear to be a meaningful pattern by age. As such, the models below do not adjust for age.

## [[1]]
##   Variable pvalue  sig
## 1   Region 0.5807     
## 2     Race 0.0000  xxx
## 3      Age     NA <NA>
## 
## [[2]]
##         Variable pvalue  sig
## 1 Race by region 0.8024     
## 2  Age by region     NA <NA>
## 3    Race by age     NA <NA>

Instantaneous partners

We first look at the number of observations that are non-missing for this outcome and for model predictors

## 
## Hispanic    Black    Other 
##       35        8      114
## 
## 18-24 25-29 30-34 35-39 40-49 50-59 
##    45    33    19    20    21    19
## 
## King County  Western WA  Eastern WA 
##          85          41          31

Next we will look at how to parameterize age by plotting the coefficients from the full model with age input as a dummy variable with 5-year age bins.

This plot does not show any clear pattern of racial/ethnic mixing by age. As such, we will exclude age from the models below in examining associations with region adjusting for race/ethnicity.

## [[1]]
##   Variable pvalue  sig
## 1   Region   0.29     
## 2     Race   0.00  xxx
## 3      Age     NA <NA>
## 
## [[2]]
##         Variable pvalue  sig
## 1 Race by region  0.368     
## 2  Age by region     NA <NA>
## 3    Race by age     NA <NA>

Age mixing

As with race/ethnicity, examination of age differences with main and persistent partners is restricted to most recent partners who are ongoing. For this analysis, we model age mixing using a linear model in which the outcome is the absolute difference between the square roots of the ego's and alter's ages.
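The age-mixing outcome can be computed directly from the two ages. The sketch below uses simulated ages and a hypothetical helper, `sqrt_age_diff()`; it also omits the race/ethnicity and region terms for brevity.

```r
# Hypothetical helper: absolute difference between square roots of two ages.
sqrt_age_diff <- function(ego_age, alter_age) {
  abs(sqrt(ego_age) - sqrt(alter_age))
}

# The same value of the outcome corresponds to a larger gap in years at older
# ages: both of these pairs give 1.
print(sqrt_age_diff(25, 36))  # 11-year gap
print(sqrt_age_diff(49, 64))  # 15-year gap

# Simulated egos and alters (not survey data).
set.seed(3)
n <- 300
ego_age   <- sample(18:59, n, replace = TRUE)
alter_age <- pmax(18, round(ego_age + rnorm(n, mean = 0, sd = 6)))

# Age groups with ages 40-59 in 10-year bins.
age_cat <- cut(ego_age, breaks = c(17, 24, 29, 34, 39, 49, 59),
               labels = c("18-24", "25-29", "30-34", "35-39", "40-49", "50-59"))

# Linear model of the outcome on ego age group.
fit <- lm(sqrt_age_diff(ego_age, alter_age) ~ age_cat)
print(coef(fit))
```

The square-root scale compresses differences at older ages, so a given gap in years counts for less between two older men than between two younger men.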

Main partners

We first look at the number of observations that are non-missing for this outcome and for model predictors

## 
## Hispanic    Black    Other 
##       70       11      255
## 
## 18-24 25-29 30-34 35-39 40-44 45-49 50-54 55-59 
##    81    82    62    27    23    29    19    13
## 
## King County  Western WA  Eastern WA 
##         221          72          43

Next we will look at how to parameterize age by plotting the coefficients from the full model with age input as a dummy variable with 5-year age bins.

With the exception of the coefficients for ages 50-54 and 55-59, these appear close to linear. The coefficients for these older groups suggest that a linear parameterization may not be appropriate, however. Although ages 50-54 and 55-59 appear quite different, there are only 19 and 13 men in these bins, respectively, so we will model age as a categorical variable, collapsing ages 40-59 into 10-year bins.

## [[1]]
##   Variable pvalue sig
## 1   Region 0.2341    
## 2     Race 0.0151   x
## 3      Age 0.0000 xxx
## 
## [[2]]
##         Variable pvalue sig
## 1 Race by region 0.1525    
## 2  Age by region 0.1244    
## 3    Race by age 0.0064  xx

Persistent partners

We first look at the number of observations that are non-missing for this outcome and for model predictors

## 
## Hispanic    Black    Other 
##       18        6       77
## 
## 18-24 25-29 30-34 35-39 40-44 45-49 50-54 55-59 
##    23    15    22     7    14    11     6     3
## 
## King County  Western WA  Eastern WA 
##          62          26          13

Next we will look at how to parameterize age by plotting the coefficients from the full model with age input as a dummy variable with 5-year age bins.

With the exception of the estimate for ages 50-54, there does not appear to be a meaningful trend by age. However, to be consistent with the approach taken for main partners, we will model age as a categorical variable, collapsing ages 40-59 into 10-year bins. Although this groups together ages 50-54 and 55-59, for which the estimates appear different, there are only 6 and 3 men in these age groups, respectively.

The models and likelihood ratio tests are below.

## [[1]]
##   Variable pvalue sig
## 1   Region 0.0123   x
## 2     Race 0.4145    
## 3      Age 0.0000 xxx
## 
## [[2]]
##         Variable pvalue sig
## 1 Race by region 0.9430    
## 2  Age by region 0.3272    
## 3    Race by age 0.6858

Instantaneous partners

We first look at the number of observations that are non-missing for this outcome and for model predictors

## Error in mrp_type_r %in% "One time": object 'mrp_type_r' not found

Next we will look at how to parameterize age by plotting the coefficients from the full model with age input as a dummy variable with 5-year age bins.

This plot suggests that the difference between the square roots of ego and alter ages increases with age. To be consistent with the approach for main partnerships, and because the estimates for 30-34 and 50-59 do not fit a linear pattern, we will model age as a categorical variable with ages 40-59 in 10-year bins.

The models and likelihood ratio tests are below.

## [[1]]
##   Variable pvalue sig
## 1   Region 0.3179    
## 2     Race 0.0726    
## 3      Age 0.0000 xxx
## 
## [[2]]
##         Variable pvalue sig
## 1 Race by region 0.6063    
## 2  Age by region 0.0078  xx
## 3    Race by age 0.0018  xx

Partnership age

Partnership age was modeled as a linear outcome.

Main partners

We first look at the number of observations that are non-missing for this outcome and for model predictors

## 
## Hispanic    Black    Other 
##       71       12      257
## 
## 18-24 25-29 30-34 35-39 40-44 45-49 50-54 55-59 
##    82    83    62    27    24    30    19    13
## 
## King County  Western WA  Eastern WA 
##         224          74          42

Next we will look at how to parameterize age by plotting the coefficients from the full model with age input as a dummy variable with 5-year age bins.

The coefficients suggest that partnership age increases with ego age. In the models below, we will put age in as a categorical variable with ages 40-59 in 10-year bins.

The models and likelihood ratio tests are below.

## [[1]]
##   Variable pvalue sig
## 1   Region 0.2652    
## 2     Race 0.2475    
## 3      Age 0.0000 xxx
## 
## [[2]]
##         Variable pvalue sig
## 1 Race by region 0.7893    
## 2  Age by region 0.0251   x
## 3    Race by age 0.5642

Persistent partners

We first look at the number of observations that are non-missing for this outcome and for model predictors

## 
## Hispanic    Black    Other 
##       18        6       78
## 
## 18-24 25-29 30-34 35-39 40-44 45-49 50-54 55-59 
##    22    15    23     7    14    12     6     3
## 
## King County  Western WA  Eastern WA 
##          64          26          12

Next we will look at how to parameterize age by plotting the coefficients from the full model with age input as a dummy variable with 5-year age bins.

The coefficients do not seem to indicate any meaningful pattern by age. As such, we will exclude age from the models below.

## [[1]]
##   Variable pvalue  sig
## 1   Region 0.8282     
## 2     Race 0.7284     
## 3      Age     NA <NA>
## 
## [[2]]
##         Variable pvalue  sig
## 1 Race by region 0.4822     
## 2  Age by region     NA <NA>
## 3    Race by age     NA <NA>

Proportion never tested

The parameter indicating the proportion of men who never test is meant to indicate the proportion of men who will not be screened for HIV. As documented in section @ref(hivtesting), it appears that if men do not test by age 40 they are unlikely to do so later. As such, we will estimate the proportion of men who never test for HIV based on the proportion of men aged 40 and above who report never having tested. This analysis includes men who have taken PrEP, because many men who go on PrEP are likely to have come in for testing as the precipitating event to initiating PrEP. Excluding them would overestimate the proportion never tested. To see if this varies by race and region, we will model the probability of having never tested among men aged 40 to 59.

We first look at the number of observations that are non-missing for this outcome and for model predictors.

## 
## Hispanic    Black    Other 
##       33        9      229
## 
## 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 
## 16 11 11 21 13 14 19 14 17  9 15  7 17  8 13 19 13 13 15  6
## 
## King County  Western WA  Eastern WA 
##         162          74          35

Because this outcome is defined based on age, we will not include age as a covariate in these models.

## [[1]]
##   Variable pvalue  sig
## 1   Region 0.0076   xx
## 2     Race 0.4018     
## 3      Age     NA <NA>
## 
## [[2]]
##         Variable pvalue  sig
## 1 Race by region 0.1163     
## 2  Age by region     NA <NA>
## 3    Race by age     NA <NA>

Intertest interval

The last test interval is estimated as the days since men reported their last HIV test, assuming men test as an interval process (see Section @ref(iti)). This analysis is restricted to men who reported having ever tested, and excludes men who reported use of PrEP in the past 12 months, as testing patterns on PrEP will be different and will be represented with a separate parameter in the model.

We first look at the number of observations that are non-missing for this outcome and for model predictors

## 
## Hispanic    Black    Other 
##      111       33      428
## 
## 18-24 25-29 30-34 35-39 40-44 45-49 50-54 55-59 
##   166   103    83    59    42    40    42    37
## 
## King County  Western WA  Eastern WA 
##         320         168          84

Next we will look at how to parameterize age by plotting the coefficients from the full model with age input as a dummy variable with 5-year age bins.

The coefficients suggest that the intertest interval increases with age, though the estimates do not follow a clear pattern. In the models below, we will group age into the following bins: 18-24, 25-34, 35+.

## [[1]]
##   Variable pvalue sig
## 1   Region 0.4709    
## 2     Race 0.5692    
## 3      Age 0.0000 xxx
## 
## [[2]]
##         Variable pvalue sig
## 1 Race by region 0.8764    
## 2  Age by region 0.8302    
## 3    Race by age 0.3563

Coital frequency

Data on coital frequency are from all most recent partnerships, not just those that are ongoing.

Main partners

We first look at the number of observations that are non-missing for this outcome and for model predictors

## 
## Hispanic    Black    Other 
##       74       13      270
## 
## 18-24 25-29 30-34 35-39 40-44 45-49 50-54 55-59 
##    92    84    64    29    23    30    21    14
## 
## King County  Western WA  Eastern WA 
##         232          81          44

Next we will look at how to parameterize age by plotting the coefficients from the full model with age input as a dummy variable with 5-year age bins.

From these coefficients, coital frequency appears to be higher for men aged 30-34 (though not significantly so), and then to decrease with age. In subsequent analyses, we will include age as a dummy variable with ages 40-59 in 10-year bins.

## [[1]]
##   Variable pvalue sig
## 1   Region 0.0014  xx
## 2     Race 0.6084    
## 3      Age 0.2177    
## 
## [[2]]
##         Variable pvalue sig
## 1 Race by region 0.5558    
## 2  Age by region 0.1565    
## 3    Race by age 0.9343

Persistent partners

We first look at the number of observations that are non-missing for this outcome and for model predictors

## 
## Hispanic    Black    Other 
##       34       13      139
## 
## 18-24 25-29 30-34 35-39 40-44 45-49 50-54 55-59 
##    50    36    32    23    19    13     7     6
## 
## King County  Western WA  Eastern WA 
##         103          60          23

Next we will look at how to parameterize age by plotting the coefficients from the full model with age input as a dummy variable with 5-year age bins.

There does not appear to be a meaningful trend by age for coital frequency in persistent partnerships, so we will exclude age from subsequent models.

## [[1]]
##   Variable pvalue  sig
## 1   Region 0.7248     
## 2     Race 0.5029     
## 3      Age     NA <NA>
## 
## [[2]]
##         Variable pvalue  sig
## 1 Race by region 0.8757     
## 2  Age by region     NA <NA>
## 3    Race by age     NA <NA>

Sex role

Sex role is categorized as exclusively bottom, exclusively top, or versatile, based on men’s reported role in anal sex over the past 12 months. Responses of “mostly a bottom”, “mostly a top”, and “equally a bottom and a top” are categorized as versatile. For this analysis, dichotomous indicators were constructed to measure whether men report exclusively topping, exclusively bottoming, or a versatile sex role.

We first look at the number of observations that are non-missing for these outcomes and for model predictors

## 
## Hispanic    Black    Other 
##      170       40      614
## 
## 18-24 25-29 30-34 35-39 40-44 45-49 50-54 55-59 
##   228   180   127    81    59    66    42    41
## 
## King County  Western WA  Eastern WA 
##         490         215         119

Exclusively topping

To inform parameterization of age in models exploring heterogeneity in the proportion of men who report exclusively topping, we will plot the coefficients from a model with age input as a dummy variable with 5-year age bins, adjusting for race/ethnicity and region.

Although the proportion of men reporting that they are exclusive tops seems to increase through age 44, it then fluctuates for older age groups. In subsequent models, we will represent age as a categorical variable with ages 40-59 binned in 10-year groups.

## [[1]]
##   Variable pvalue sig
## 1   Region 0.2446    
## 2     Race 0.0383   x
## 3      Age 0.0406   x
## 
## [[2]]
##         Variable pvalue sig
## 1 Race by region 0.4280    
## 2  Age by region 0.9384    
## 3    Race by age 0.0026  xx

Exclusively bottoming

As above, to inform parameterization of age in models exploring heterogeneity in the proportion of men who report exclusively bottoming, we will plot the coefficients from a model with age input as a dummy variable with 5-year age bins, adjusting for race/ethnicity and region.

The proportion of men reporting that they are exclusive bottoms seems to drop for men aged 35-39 and then increase again. To further explore this effect, in subsequent models, we will represent age as a categorical variable with ages 40-59 binned in 10-year groups.

## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred

## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## [[1]]
##   Variable pvalue sig
## 1   Region 0.8890    
## 2     Race 0.0049  xx
## 3      Age 0.0073  xx
## 
## [[2]]
##         Variable pvalue sig
## 1 Race by region 0.6190    
## 2  Age by region 0.1008    
## 3    Race by age 0.0151   x

Versatile

To inform parameterization of age in models exploring heterogeneity in the proportion of men who report being versatile in their anal sex role, we will plot the coefficients from a model with age input as a dummy variable with 5-year age bins, adjusting for race/ethnicity and region.

These data do not suggest any clear trend by age. However, because sex role will be represented in the model using one 3-level variable for exclusively top, exclusively bottom, or versatile, we will include age in the models below as a categorical variable to be consistent with the approach taken for the indicators of exclusively topping and exclusively bottoming.

## [[1]]
##   Variable pvalue sig
## 1   Region 0.3840    
## 2     Race 0.0145   x
## 3      Age 0.4563    
## 
## [[2]]
##         Variable pvalue sig
## 1 Race by region 0.1857    
## 2  Age by region 0.1885    
## 3    Race by age 0.0040  xx

Condom use

Data on condom use are restricted to dyads in which both the respondent and his partner were HIV-negative or of unknown status. This analysis includes data from all such partnerships, not just those that were ongoing. Because the model will explore the effect of decreased condom use in men on PrEP, we will further restrict this analysis to men who have not taken PrEP in the past 12 months. We exclude all men who have taken PrEP in the past 12 months, regardless of whether they are currently using it, because their reported behaviors in the past 12 months may reflect behaviors while on PrEP. We will also exclude men who report that their most recent partner is taking PrEP.
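As a rough illustration of these inclusion rules, the Python sketch below filters a list of dyad records. The field names (`ego_hiv_status`, `partner_hiv_status`, `ego_prep_past_12mo`, `partner_on_prep`) and the example records are hypothetical and do not correspond to variables in the survey data.

```python
# HIV statuses that keep a dyad in the condom use analysis.
NEG_OR_UNKNOWN = {"negative", "unknown"}


def eligible_for_condom_analysis(dyad):
    """Return True if a dyad meets all restrictions described in the text.

    All dictionary keys are hypothetical names chosen for this sketch.
    """
    return (
        dyad["ego_hiv_status"] in NEG_OR_UNKNOWN
        and dyad["partner_hiv_status"] in NEG_OR_UNKNOWN
        and not dyad["ego_prep_past_12mo"]   # any PrEP use in the past 12 months excludes
        and not dyad["partner_on_prep"]      # partner reported to be taking PrEP excludes
    )


# Made-up example records, for illustration only.
dyads = [
    {"ego_hiv_status": "negative", "partner_hiv_status": "unknown",
     "ego_prep_past_12mo": False, "partner_on_prep": False},
    {"ego_hiv_status": "negative", "partner_hiv_status": "positive",
     "ego_prep_past_12mo": False, "partner_on_prep": False},
    {"ego_hiv_status": "negative", "partner_hiv_status": "negative",
     "ego_prep_past_12mo": True, "partner_on_prep": False},
]

analysis_set = [d for d in dyads if eligible_for_condom_analysis(d)]
print(len(analysis_set))
```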

Main partners

We first look at the number of observations that are non-missing for this outcome and for model predictors.

## 
## Hispanic    Black    Other 
##       50       10      183
## 
## 18-24 25-29 30-34 35-39 40-44 45-49 50-54 55-59 
##    65    66    41    19    14    15    16     7
## 
## King County  Western WA  Eastern WA 
##         137          66          40

Next we will look at how to parameterize age by plotting the coefficients from the full model with age input as a dummy variable with 5-year age bins.

The plot of these coefficients suggests that condom use in main partnerships is lower for men aged 35-39 and 45-49. The number of respondents aged 40 and higher is somewhat small, so we will group ages 40-59 in 10-year bins in subsequent analyses.

## [[1]]
##   Variable pvalue sig
## 1   Region 0.6827    
## 2     Race 0.0960    
## 3      Age 0.0030  xx
## 
## [[2]]
##         Variable pvalue sig
## 1 Race by region 0.1163    
## 2  Age by region 0.0353   x
## 3    Race by age 0.2439

Persistent partners

We first look at the number of observations that are non-missing for this outcome and for model predictors.

## 
## Hispanic    Black    Other 
##       25        7       63
## 
## 18-24 25-29 30-34 35-39 40-44 45-49 50-54 55-59 
##    33    16    15    10     6     7     5     3
## 
## King County  Western WA  Eastern WA 
##          43          35          17

Next we will look at how to parameterize age by plotting the coefficients from the full model with age input as a dummy variable with 5-year age bins.

These data do not suggest a clear pattern by age for condom use with persistent partners. We will not include age in models testing associations by race/ethnicity and region.

## [[1]]
##   Variable pvalue  sig
## 1   Region 0.9016     
## 2     Race 0.1525     
## 3      Age     NA <NA>
## 
## [[2]]
##         Variable pvalue  sig
## 1 Race by region 0.8454     
## 2  Age by region     NA <NA>
## 3    Race by age     NA <NA>

Instantaneous partners

We first look at the number of observations that are non-missing for this outcome and for model predictors.

## 
## Hispanic    Black    Other 
##       26        3       72
## 
## 18-24 25-29 30-34 35-39 40-44 45-49 50-54 55-59 
##    32    20    10    16     4     6     7     6
## 
## King County  Western WA  Eastern WA 
##          48          30          23

Next we will look at how to parameterize age by plotting the coefficients from the full model with age input as a dummy variable with 5-year age bins. All men aged 45-49 report condom use with instantaneous partners, so the coefficient for this group is high with a large standard error, masking the variability across other age groups. The second plot imposes restrictions on the y-axis, which results in dropping the point for ages 45-49.

## Warning: Removed 1 rows containing missing values (geom_pointrange).

The plots of these coefficients do not suggest that there is any meaningful pattern by age for the probability of condom use with instantaneous partners. As such, age will not be included in subsequent analyses.

## [[1]]
##   Variable pvalue  sig
## 1   Region 0.2067     
## 2     Race 0.0803     
## 3      Age     NA <NA>
## 
## [[2]]
##         Variable pvalue  sig
## 1 Race by region 0.1325     
## 2  Age by region     NA <NA>
## 3    Race by age     NA <NA>

PrEP use

To look at differences in PrEP use, we define two outcomes: reported current use of PrEP among those for whom PrEP is recommended, and reported current use of PrEP among those with whom PrEP should be discussed based on Washington State guidelines.1

PrEP use among men who should discuss PrEP with a provider

We first look at the number of observations that are non-missing for this outcome and for model predictors.

## 
## Hispanic    Black    Other 
##       51       12      175
## 
## 18-24 25-29 30-34 35-39 40-44 45-49 50-54 55-59 
##    63    53    40    20    22    24     8     8
## 
## King County  Western WA  Eastern WA 
##         139          64          35

Next we will look at how to parameterize age by plotting the coefficients from the full model with age input as a dummy variable with 5-year age bins.

As among men recommended to initiate PrEP, this plot suggests that the probability of PrEP use among men who meet criteria for discussing PrEP is significantly higher for those aged 25-49 (and non-significantly higher for those aged 50-59) relative to those aged 18-24. Due to the small number of observations among men aged 40 and older, we will group ages 40-59 into 10-year age groups and include age as a factor variable.

## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred

## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred

## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
## [[1]]
##   Variable pvalue sig
## 1   Region 0.4489    
## 2     Race 0.3094    
## 3      Age 0.0005 xxx
## 
## [[2]]
##         Variable pvalue sig
## 1 Race by region 0.0213   x
## 2  Age by region 0.0021  xx
## 3    Race by age 0.0107   x

Summary tables

In these tables, “NA” indicates that age was not included in the regression model based on examination of the coefficients.
Summary of regression model findings for associations with network model parameters
Main effects
Interactions
Parameter Region Race Age Race by region Age by region Race by age
Degree distribution
Ongoing main partner x x NA x NA NA
Any ongoing persistent partners NA NA NA
Any ongoing persistent partners (with age) xxx
Concurrent ongoing persistent partners NA NA NA
Concurrent ongoing persistent partners (with age) xx
Ongoing main with concurrent persistent partners NA NA NA
Ongoing main with concurrent persistent partners (with age) x
Rate of instantaneous partnerships
No main partners x
Main partner x
No persistent partners x
Persistent partners
Proportion of partnerships same race
Main partnerships xxx NA NA NA
Persistent partnerships xxx NA NA NA
Instantaneous partnerships xxx NA NA NA
Age mixing (sqrt of age difference)
Main partnerships x xxx xx
Persistent partnerships x xxx
Instantaneous partnerships xxx xx xx
Partnership age
Main partnerships xxx x
Persistent partnerships NA NA NA
1 x = p-value<0.05
2 xx = p-value<0.01
3 xxx = p-value<0.001
Summary of regression model findings for associations with epidemic model parameters
Main effects
Interactions
Parameter Region Race Age Race by region Age by region Race by age
Proportion never tested xx NA NA NA
Intertest interval xxx
Coital frequency
Main partnerships xx
Persistent partnerships NA NA NA
Anal sex role
Exclusive top x x xx
Exclusive bottom xx xx x
Versatile x xx
Condom use
Main partnerships xx x
Persistent partnerships NA NA NA
Instantaneous partnerships NA NA NA
PrEP use by WA state guidelines
Recommend x x xxx
Discuss xxx x xx x
1 x = p-value<0.05
2 xx = p-value<0.01
3 xxx = p-value<0.001
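The significance markers in these tables follow the thresholds in the legends. The short Python sketch below shows that mapping. The function name is hypothetical, and returning "NA" for a term that was not fitted is an assumption made for illustration; the p-values in the example are those reported for exclusive topping.

```python
def sig_marker(pvalue):
    """Map a p-value to the marker used in the summary tables.

    None stands for a term that was not included in the model
    (an assumption for this sketch).
    """
    if pvalue is None:
        return "NA"
    if pvalue < 0.001:
        return "xxx"
    if pvalue < 0.01:
        return "xx"
    if pvalue < 0.05:
        return "x"
    return ""


# p-values reported above for exclusive topping.
exclusive_top = {"Region": 0.2446, "Race": 0.0383, "Age": 0.0406, "Race by age": 0.0026}
markers = {term: sig_marker(p) for term, p in exclusive_top.items()}
print(markers)
```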

These analyses were intended as a guide to suggest which nodal attributes appear to shape patterns of partnership formation/persistence and behaviors related to transmission and diagnosis. For network model parameters, it appears that the probability of having a main partner differs by race/ethnicity, region, and the interaction of race/ethnicity and region; the probabilities of having any persistent and concurrent persistent partners differ by age; and the rate of instantaneous partners differs by age. We expect the probability of having a partner of the same race/ethnicity to differ by race/ethnicity, which the network model will capture, so this does not warrant further exploration. Parameterizing age mixing using the difference between the square root of ego and alter ages does not seem to fully account for differences in age mixing by ego age, and we will also explore differences in age mixing by race among main partnerships and by region among persistent partnerships. Partnership age appears to differ by ego age, but this is complicated by censoring: younger men’s partnerships will be shorter because they started more recently. To better understand the underlying pattern and confirm that it aligns with our expectations, we will explore it using plots.

For epidemic model parameters, we will explore regional differences in the proportion never tested, age differences in the intertest interval, age and racial/ethnic differences in anal sex role, and age differences in condom use among main partners. For PrEP use, we will explore differences by all three nodal attributes.

Plots

The plots below further explore the associations highlighted in the above section.

Ongoing main partner by race/ethnicity, region, and their interaction

In the plots below, the error bars correspond to the 95% confidence interval (point estimate +/- 1.96*SE).

Decision: Code the model to represent heterogeneity by race and region as main effects, but don’t try to represent the interaction between them.

Ongoing persistent partners by age

Examination of the coefficients for age from models adjusting for race/ethnicity and region suggested that men aged 40-49 have a higher probability of having any persistent partners, two or more persistent partners, and persistent partners concurrent with main partners. The plots below explore these age effects in more detail. Note that these plots do not adjust for race/ethnicity or region, as differences on these factors were not found to be significant. The error bars correspond to the 95% confidence intervals (point estimate +/- 1.96*SE).

In the plots below, we define age as a binary variable for men aged 40-49 versus all other ages.

Decision: Men aged 40-49 appear to have a higher likelihood of having persistent partners, concurrent persistent partners, and concurrent persistent and main partners. To allow the model to represent these patterns, we will write code to define separate probabilities for men aged 40-49 of forming a persistent partnership, having concurrent persistent partnerships, and having a persistent partnership concurrent with a main partnership. Because this pattern by age doesn’t align with our understanding of MSM’s sexual behavior, in the base model we will set this probability to be the same as for men of other ages. However, because there was a strong signal in these data, we will run sensitivity analyses in which we define a higher probability for men aged 40 to 49.

Rate of instantaneous partnerships by age

First we will look at the overall distribution of instantaneous partners by single year of age, as shown in the left panel of figure @ref(fig:plots_explore_rate_inst_1) with a superimposed loess smooth curve. From this we see that there are some outliers - 4 respondents who reported >0.25 instantaneous partnerships per day. These men reported between 139 and 340 instantaneous partners in the past 12 months. The right panel shows the same plot but with these 4 outliers removed. This plot indicates that the rate of instantaneous partners increases with age through ~age 30, at which point it plateaus, and then begins to decline at ~age 45.

Figure @ref(fig:plots_explore_rate_inst_2) groups ages into smaller bins for those aged 18-34 to get insight as to where the inflection point might be. The high point for ages 35-39 is largely driven by the outliers - removal of these outliers drops the point down in line with the others.

This plot also shows a roughly bell-shaped curve, similar to the fitted loess curve in the plot above. To further explore how the rate of instantaneous partners varies with age, we will look at the proportion of men in each age group that fall in each of the quartiles of the distribution of instantaneous partnerships identified in section @ref(rateinst_explore). The table below shows the cut-points for the quartiles and how many men fall in each group. The number in each quartile is not equal because many men’s reported rate fell on the cutpoint between quartiles. In the plot, the error bars indicate the 95% confidence interval on the point estimates.

## 
##                  0       (0, 0.00274] (0.00274, 0.00821] 
##                427                121                156 
##   (0.00821, 0.931] 
##                190
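To make the cut-points easier to read, the Python sketch below converts a count of instantaneous partners in the past 12 months to a daily rate and assigns it to one of the four groups. It assumes the rate was computed as the annual count divided by 365.25, under which the rounded cut-points 0.00274 and 0.00821 correspond to roughly 1 and 3 partners per year and the upper bound 0.931 to 340 partners; that conversion and the function names are assumptions, not taken from the analysis code.

```python
# Assumption: daily rate = partners reported in the past 12 months / 365.25.
DAYS_PER_YEAR = 365.25

# Assumption: the printed (rounded) cut-points correspond to 1 and 3
# partners per year under the conversion above.
LOW_CUT = 1 / DAYS_PER_YEAR
MEDIUM_CUT = 3 / DAYS_PER_YEAR


def daily_rate(partners_past_year):
    """Convert an annual partner count to a per-day rate."""
    return partners_past_year / DAYS_PER_YEAR


def rate_group(partners_past_year):
    """Assign a respondent to the zero / low / medium / high rate group."""
    rate = daily_rate(partners_past_year)
    if rate == 0:
        return "zero"
    if rate <= LOW_CUT:
        return "low"
    if rate <= MEDIUM_CUT:
        return "medium"
    return "high"


for n in (0, 1, 3, 4, 139, 340):
    print(n, round(daily_rate(n), 5), rate_group(n))
```

Because many respondents report small whole numbers of partners, many rates fall exactly on a cut-point, which is why the four groups are not of equal size.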

It looks like the proportion of men reporting zero instantaneous partners drops with age until ~age 30, then stabilizes and increases for men aged 50+. The proportion of men in the top quartile for the rate of instantaneous partnerships mirrors this trend, increasing and then dropping off again in the older ages. The proportion of men in the middle quartiles does not show any meaningful pattern with age.

In the next plots, we will look to see whether the distribution by age differs by momentary degree. These plots exclude the four outliers identified above. We first look at the rate of instantaneous partnerships by single year of age, with superimposed loess smooth curves.

These plots do not suggest any clear differences by momentary degree in how the rate of instantaneous partnerships varies with age. To further examine the distribution of the rate of instantaneous partnerships by momentary degree, the plots below examine whether the percent of men in each quartile by age varies by main and persistent partnership status. Red indicates zero instantaneous partnerships, orange indicates a low rate, blue indicates a medium rate, and green indicates a high rate.

Decision: There seems to be an age effect on the rate of instantaneous partnerships, but it does not appear to vary meaningfully by main or persistent degree. As such, we will define the rate of instantaneous partnerships to vary by age but not by degree. To capture the age effect, we could include terms for centered age (age - mean age) and centered age squared, or we could define a transition matrix wherein men move through the quartiles of the distribution as they age. Steve, which approach seems better to you? Plot @ref(fig:rate_inst_fit) shows the rate of instantaneous partnerships by age with the loess smooth (in blue) and the predicted rate from a regression model adjusting for centered age and centered age squared (red).
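For reference, the first option (centered age plus centered age squared) amounts to fitting a quadratic in age around the mean age. The Python sketch below fits such a model by ordinary least squares on synthetic data. It is not the R model used for the plot, the data are invented so that the true curve is known, and all function names are hypothetical.

```python
def fit_centered_quadratic(ages, rates):
    """OLS fit of rate = b0 + b1*(age - mean) + b2*(age - mean)^2."""
    mean_age = sum(ages) / len(ages)
    # Design matrix rows: intercept, centered age, centered age squared.
    X = [[1.0, a - mean_age, (a - mean_age) ** 2] for a in ages]
    k = 3
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    aug = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * y for row, y in zip(X, rates))]
        for i in range(k)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, k):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, k + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution.
    coefs = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(aug[i][j] * coefs[j] for j in range(i + 1, k))
        coefs[i] = (aug[i][k] - tail) / aug[i][i]
    return mean_age, coefs


def predict(age, mean_age, coefs):
    """Predicted rate at a given age from the fitted coefficients."""
    c = age - mean_age
    return coefs[0] + coefs[1] * c + coefs[2] * c * c


# Synthetic data (not survey data): an inverted-U rate peaking at age 38.5.
ages = list(range(18, 60))
rates = [0.02 - 0.00002 * (a - 38.5) ** 2 for a in ages]
mean_age, coefs = fit_centered_quadratic(ages, rates)
print(mean_age, coefs)
```

A negative coefficient on the squared term gives the rise-then-decline shape described above; centering age reduces the correlation between the linear and squared terms.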

Age mixing by age and race

The left panel of figure @ref(fig:plots_explore_agemixing_1) shows a scatter plot of respondents’ ages against the ages reported for their partners in single years. In partnerships below the diagonal line, the ego is older than his partner; in partnerships above the line, the partner is older than the ego. The right panel plots the absolute value of the difference between ego and alter ages on the y-axis, the older of the two partners’ ages on the x-axis, and a superimposed loess smooth curve.

To inform how to parameterize age differences, plot @ref(fig:plots_explore_agemixing_2) shows the absolute difference between the square root of ego and alter ages by ego age, and plot @ref(fig:plots_explore_agemixing_3) shows the absolute difference between the cube root of ego and alter ages by ego age. In both plots, the x-axis shows the older of the two partners’ ages (the respondent’s age or the age he reported for his partner).
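The effect of these transformations can be seen with a small Python sketch. It compares the raw, square-root, and cube-root age differences for two hypothetical partnerships with the same 5-year gap at different ages; the ages and the function name are chosen purely for illustration.

```python
def abs_diff(ego_age, alter_age, power=1.0):
    """Absolute difference between ego and alter ages after raising both to `power`.

    power=1 gives the raw difference, 0.5 the square-root difference,
    and 1/3 the cube-root difference.
    """
    return abs(ego_age ** power - alter_age ** power)


# Two hypothetical partnerships with the same raw 5-year age gap.
pairs = [(20, 25), (50, 55)]
for ego, alter in pairs:
    print(
        ego, alter,
        abs_diff(ego, alter),
        round(abs_diff(ego, alter, 0.5), 3),
        round(abs_diff(ego, alter, 1 / 3), 3),
    )
```

Under both transformations, a given raw gap counts for less among older men than among younger men, which is how the transformation can absorb some of the tendency for raw age differences to grow with age.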

To explore variation in age mixing by race/ethnicity, plot @ref(fig:plots_explore_agemixing_4) shows age mixing by age group and race in terms of the mean absolute difference in age between the ego and his partner in each age group.

To explore variation in age mixing by region, plot @ref(fig:plots_explore_agemixing_5) shows age mixing by age group and region in terms of the mean absolute difference in age between the ego and his partner in each age group.

Decision: To capture the age heterogeneity in age mixing, we need to decide how to parameterize the age difference. From plots @ref(fig:plots_explore_agemixing_2) and @ref(fig:plots_explore_agemixing_3), taking the difference between the square root of ego and alter ages reduces the heterogeneity by age, but does not fully account for differences by ego age in the age difference with persistent partnerships. Taking the cube root does not lead to a noticeable improvement. How to proceed? Are these plots showing what we want them to?

In the regression analyses, age mixing differed by race/ethnicity only among main partnerships. In the plot, the patterns for non-Hispanic other and Hispanic respondents appear similar. Black men ages 18-24 report partners who are, on average, further in age from themselves than the other racial/ethnic groups. We see this for persistent partnerships as well. However, the numbers are small: 3 black respondents aged 18-24 provided data on the age of main partners and only 1 provided data on the age of persistent partners. As such, these patterns should be interpreted with caution. Apart from these differences among men aged 18-24, the pattern otherwise appears similar by age. Additionally, interpretation of these data is made difficult by the fact that there are imbalances in reported age mixing (see @ref(mixing)). In light of these issues, we will not stratify age mixing terms by race/ethnicity.

Age mixing differed by region only among persistent partnerships. Plot @ref(fig:plots_explore_agemixing_5) indicates that men ages 18-24 in Western Washington had larger absolute differences in age with their persistent partners, but this is based on a sample size of only 8 in this region and age group. The data otherwise do not indicate clear patterns by region, so we will not stratify age mixing by region.

Partnership age by respondent age

These plots show the mean partnership age in main and persistent partnerships, respectively, by respondent age.

Decision: The pattern by age for main partnerships is as we expect: partnership age increases as age increases. In part this may be due to the fact that younger men have not had the opportunity to accrue as much time with their current partners as older men. Due to this censoring effect, we will not try to represent this heterogeneity in the network model, but we will plot the age distribution produced by the model to check how well it matches this observed pattern.

Never testing for HIV by region

The regression analysis indicated that the probability of never having tested for HIV among men aged 40 and older differed by region. The plot below shows the proportion of men aged 40 to 59 who report never having tested by region, with error bars corresponding to the 95% confidence intervals on the estimates (1.96 times the standard error).

Decision: This plot indicates that the proportion of men who have never tested for HIV varies by region. We will stratify this parameter in the model by region.

Intertest interval by age

The first figure below looks at the mean and median intertest intervals by age group among men who have not used PrEP in the past 12 months. The second shows only the median to allow for a closer examination of the variability by age.

Decision: These data suggest that the intertest interval increases with age. The effect appears to be close to linear up to age 44, at which point it levels out. We will try to fit this with terms for centered age and centered age squared. Steve, what do you think? When we looked at this before, we just looked at the top plot, which doesn’t show the variability in the median because the scale for the y-axis is too big. Would this be hard to do?

Plot @ref(fig:iti_fit) below shows the median intertest interval by age with a superimposed line with predicted values from a linear model for median intertest interval that includes centered age and centered age squared terms. Does this seem like an okay fit?

Anal sex role by age and race/ethnicity

In the plots below, the error bars indicate the 95% confidence interval of the point estimate (point estimate +/- 1.96 * sqrt(p*(1-p)/n), where p is the estimated proportion and n is the number of observations).
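The interval calculation can be written out as a short Python sketch. The proportion and sample size in the example are hypothetical, and the function name is introduced here for illustration only.

```python
import math


def wald_ci(p, n, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion.

    p is the estimated proportion (between 0 and 1) and n the number of
    observations; z = 1.96 gives an approximate 95% interval.
    """
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


# Hypothetical example: 25% of 100 respondents.
lower, upper = wald_ci(0.25, 100)
print(round(lower, 3), round(upper, 3))
```

This sketch does not truncate the interval at 0 or 1, so with small cell sizes and proportions near the boundary the bounds can fall outside the valid range.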

Decision: These plots suggest that there is a higher probability of exclusive bottoming for younger men and a lower probability of topping, but that these probabilities converge by age 30 and remain roughly similar. Steve, can you think of a good way to capture this? It’s complicated because this has implications for age mixing since position is in the network model as an offset. Martina and I felt that, if there’s no easy way to represent this, we can let it go.

There is no clear difference in anal sex role by race/ethnicity, so we will not stratify on race/ethnicity.

Condom use by age

The plots below show the probability of condom use by respondent age in main partnerships. In regression analyses, condom use differed by age only among main partnerships, but in the plots below we will explore patterns among persistent and instantaneous partnerships as well.

Probability of condom use by respondent age in persistent partnerships:

Probability of condom use in instantaneous partnerships:

Decision: These plots indicate that there is a reasonably strong age effect for main and persistent partnerships. We will try to fit these patterns by including terms for centered age and centered age squared. However, this raises the question of how to handle age mixing: if a young man and an older man form a partnership, what will their condom use be? Steve, did you work on this with Kathryn? Did you decide to take the average in such cases?

The plots below superimpose predicted condom use from a regression model adjusting for centered age and centered age squared (red) and from a regression model additionally adjusting for centered age cubed (green). It looks like the model with the age cubed term fits the data for main partnerships slightly better, but it perhaps puts too much weight on the high reported condom use at the end of the age distribution. Similarly, the regression line from the model with the age cubed term may fit the persistent partner data slightly better except at the ends of the distribution. The regression models for instantaneous partnerships definitely do not fit the data well: the predictions for ages 18 to 25 are negative, and the data plotted with single years of age are much more scattered due to small cell counts. What do you make of this? How should I proceed? Note that these plots exclude men who have taken PrEP in the past 12 months, which we may decide not to do.

## Warning: Removed 374 rows containing missing values (geom_path).
## Warning: Removed 295 rows containing missing values (geom_path).

PrEP use by age, race, and region

The plots below show, by age, race, and region, the proportion reporting current PrEP use among men for whom Washington State guidelines recommend PrEP and among men with whom the guidelines recommend that providers discuss PrEP.

Decision: These plots show strong effects of age and region. The differences by race/ethnicity are not clear, but we will code the model to represent the main effects of all three given the interest in the impact of potential disparities by race/ethnicity, which we could explore in sensitivity analyses.


  1. In the WHPP survey, any diagnosis of syphilis was considered an indication for recommending PrEP, regardless of the stage of infection. We classified respondents in the “discuss PrEP” candidacy category if they did not meet the criteria for recommending PrEP and reported any of the following in the prior 12 months: condomless anal sex (CAS) with a partner not considered to be main or primary, CAS with a partner of unknown or positive HIV status, diagnosis of urethral gonorrhea or rectal chlamydia, or use of non-prescription injection drugs. Men also fit into this category if they reported any ongoing partnerships with HIV-positive persons who had been on ART for more than 6 months and were virologically suppressed. We did not measure whether men reported HIV-positive female partners with whom they are trying to become pregnant, completion of a course of post-exposure prophylaxis for non-occupational exposure, or seeking a prescription for PrEP.